Design of mixed data clustering algorithm based on density peak
LI Ye, CHEN Yiyan, ZHANG Shufen
Journal of Computer Applications, 2018, 38(2): 483-490. DOI: 10.11772/j.issn.1001-9081.2017082053
Abstract
To address the inability of the k-prototypes algorithm to determine the number of clusters automatically or to discover clusters of arbitrary shape, a mixed data clustering algorithm based on searching for density peaks was proposed. Firstly, the CFSFDP (Clustering by Fast Search and Find of Density Peaks) algorithm was extended to mixed datasets: the distances between mixed data objects were calculated, and CFSFDP was then applied to these distances to determine the cluster centers, so that the number of clusters was determined automatically. The remaining points were then assigned to clusters in descending order of density. Secondly, the method for selecting the threshold and the weight in the proposed algorithm was introduced. In the density formula, the threshold (cutoff distance) was extracted automatically by calculating the potential entropy of the data field; in the distance formula, the weight was defined through a statistic that measures the clustering tendency of numeric and categorical datasets. Finally, experimental results on three real mixed datasets show that, compared with the k-prototypes algorithm, the proposed algorithm effectively improves clustering accuracy.
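The overall flow described in the abstract (compute mixed distances, find density peaks, assign the remaining points in descending order of density) can be illustrated with a minimal sketch. It is not the authors' implementation and makes several simplifying assumptions: the mixed distance is a k-prototypes-style sum of Euclidean distance and a weighted count of categorical mismatches, the `cutoff` and `weight` values are passed in by hand rather than derived from potential entropy or a clustering-tendency statistic, and the number of centers `n_centers` is supplied explicitly rather than determined automatically. The function names `mixed_distances` and `density_peak_cluster` are hypothetical.

```python
import numpy as np

def mixed_distances(num, cat, weight):
    """Pairwise distance: Euclidean on numeric columns plus
    weight * (number of mismatching categorical columns).
    This form is an assumption, not the paper's formula."""
    num_d = np.sqrt(((num[:, None, :] - num[None, :, :]) ** 2).sum(axis=2))
    cat_d = (cat[:, None, :] != cat[None, :, :]).sum(axis=2)
    return num_d + weight * cat_d

def density_peak_cluster(dist, cutoff, n_centers):
    """CFSFDP-style clustering on a precomputed distance matrix."""
    n = dist.shape[0]
    # Local density: number of other points closer than the cutoff distance.
    rho = (dist < cutoff).sum(axis=1) - 1
    # Points sorted from densest to sparsest; ties are broken by index.
    order = np.argsort(-rho, kind="stable")
    delta = np.zeros(n)
    nearest_higher = np.full(n, -1)
    delta[order[0]] = dist.max()  # convention for the densest point
    for rank in range(1, n):
        i = order[rank]
        higher = order[:rank]  # points ranked as denser than i
        j = higher[np.argmin(dist[i, higher])]
        delta[i] = dist[i, j]
        nearest_higher[i] = j
    # Centers combine high density with a large distance to denser points.
    gamma = rho * delta
    gamma[order[0]] = np.inf  # the densest point is always a center
    centers = np.argsort(-gamma, kind="stable")[:n_centers]
    labels = np.full(n, -1)
    labels[centers] = np.arange(n_centers)
    # Assign remaining points in descending density order, each taking
    # the label of its nearest denser neighbor (already labeled).
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[nearest_higher[i]]
    return labels, centers

# Toy mixed dataset: two numeric columns and one categorical column.
num = np.array([[0, 0], [0.1, 0], [0, 0.1], [0.1, 0.1],
                [10, 10], [10.1, 10], [10, 10.1], [10.1, 10.1]])
cat = np.array([["a"], ["a"], ["a"], ["a"], ["b"], ["b"], ["b"], ["b"]])
dist = mixed_distances(num, cat, weight=1.0)
labels, centers = density_peak_cluster(dist, cutoff=1.0, n_centers=2)
```

On this toy data the two well-separated groups end up in different clusters. In the proposed algorithm the cutoff, the weight and the number of centers would come from the data rather than from hand-chosen arguments.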